perm filename GEM[0,BGB]2 blob sn#077642 filedate 1973-12-13 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00014 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	3.0	GEOMETRIC MODELING THEORY.
C00003 00003	Definitions: World, World Modeling, Representation, Simulation.
C00006 00004	Kinds of World Modeling in Artificial Intelligence.
C00007 00005	Five kinds of geometric models.
C00008 00006	The geometric model has several roles: 
C00010 00007	Geometric Modeling.
C00017 00008	Camera Model.
C00021 00009	The Vertex.
C00025 00010	The Edge.
C00028 00011	The Face.
C00031 00012	Polyhedra, Bodies and Objects.
C00035 00013	Three kinds of perimeter.
C00037 00014	BFEV Accessing.
C00041 ENDMK
C⊗;
3.0	GEOMETRIC MODELING THEORY.

Introduction to Geometric Modeling.

	The  notion  of  a  "geometric  model"  for  surveying  land,
predates  the idea  of  formal Geometry.   The  very  word "geometry"
literally describes the task of  a surveyor, since the syllables mean
"earth  measure" in  Greek.   Before Euclid,   geometry  consisted of
manual methods for measuring and modeling entities in the real world.
Likewise,  the kind  of geometry in this chapter  is "pre-Euclidean",
a theory  composed of techniques empirically found  by practicing the
art on a digital computer.
	
Definitions: World, World Modeling, Representation, Simulation.

	In   the   context  of   computer   vision   and   artificial
intelligence,     "world  modeling"  refers   to  "representing"  and
"simulating" a  "world" on a  computer so  that the  course of  world
events  can be  perceived,   predicted and  altered. Practical  world
modeling  issues involve which  aspects of the  world are significant
and worth  modeling; which programming  technologies should be  used;
and what are the trade offs between storing the model and recomputing
the model. With respect to the last issue, the term  "representation"
indicates more concern with data structure and time independent features
while the  term  "simulation" indicates concern with process and dynamic
situations.

	In an abstract definition, "world  modeling" is the third
of  four  nested universal  Turing  machines.   The  outermost Turing
machine is  called  the real  world;  the  second Turing  Machine  is
simulated on the world machine and may be called the robot; the third
Turing machine  is simulated on the robot's machine and is called the
robot's world model.  The fourth Turing  machine is the robot's  self
model, at  which point the recursion  can be stopped  for the present
discussion. Although the nesting allows for self perception and  self
manipulation,   we will  assume for  the following  that the  robot's
hardware can be isolated from the real world so that it is just metal
box.

	The world is something that has states that can be changed.
From the inside of an intellectual entity it appears as if
the overwhealming majority of world states change spontaneously,
(or by means of the mind of God, if you are a Berkelian theist);
and the necessarily few states are controled.
Kinds of World Modeling in Artificial Intelligence.

	1. Geometric
	2. Semantic
	3. Logical
	4. Procedural

Five kinds of geometric models.

	1. 3-D Space Array.
	2. 2-D Surface Function Z=F(x,y).
	3. Manifolds: Body, Face, Edge, Vertex.
	4. Spine Cross Section.
	5. Cellular.

The geometric model has several roles: 

	1. Synthesis source for verification.
	2. Analysis target for revelation.
	3. Domain,
		for problem solving,
		for spatial visualization,
		for semantic definition (noun concepts),
		for recognition feature classification,
		for planning.

	The theory;
Assumption: The vision world model should be three dimensional.
The vision world model should be able to represent
surfaces of opaque objects.

	In order to get a computer to deal with the physical world it
must  have  a  data  representation  on  which computations involving
space, time, shape, size, and the appearance of things can  be  done.
Geometric Modeling.

	I  will introduce my requirements for a computer model of the
physical world in terms of its role as a memory.    As  a  memory,  a
world  model  has  contents and an addressing mechanism. The kinds of
data that I wish to hold in my world model are:

	CONTENT REQUIREMENTS
		1. Topological data.
		2. Geometric data.
		3. Photometric data.
		4. Parts tree data.

	Topological data has to do with the notion of neighborhood; a
world model has data on what is next to what. A  face,  edge,  vertex
model is essentially dedicated to surface topology; matters of volume
topology are not stored but rather must be computed.  Geometric  data
has  to  do  with  notions  such  as  locus, length, area and volume.
Photometric data includes the locus and nature of light  sources,  as
well as data on how surfaces reflect, absorb and scatter light. Parts
tree data has to do with the notion  that  objects  are  composed  of
parts,  which  I  construe  as  data on the structure of the physical
world rather than as purely an artifact of  having  structured  world
data; that is, I prefer to have the specification of how an entity is
broken into parts be external to my world model. The  kinds  of  data
not included are semantic data (other than body names); physical data
such as mass, inertia tensors, electrical properties and so  on;  and
cultural  data  such  as whether an object is a toy, tool, or weapon;
with any artistic, religious or market value.

	Next  the  kinds of addressing mechanisms I wish to have, (or
equivalently the input-output modes of the model) are:

	ACCESSING REQUIREMENTS
		1. Appearance - given a camera,  return an image of 
		   what the world would look like from that camera.
		2. Recognition - given an image, return the objects
		   from the world model  that appear in that image.
		3. Camera Solution  -  given  a  recognized  image,
		   find  the  location & orientation of the camera.
		4. Perception - given images,  from solved cameras,
		   place new bodies  into the model for portions of
		   the images  that  have not  yet been recognized.
		5. Spatial Accessing  -  given a locus  and radius,
		   return the portions  of objects  in that sphere.

Clearly,  these  are  the high level accessing requirements which are
the reasons for having a world model and the design goals  for  model
building.
Camera Model.

	As the accessing requirements imply, a world model requires a
special  entity  called  a  camera  which  is  used  to  model  image
formation. Although the camera model is important here for a complete
specification  of the data, it may be skipped on a first reading. The
particular camera model I have been using  lately,  is  expressed  by
eighteen  real numbers involving nine degrees of freedom. First there
is the camera lens center locus:

		CX, CY, CZ,	in world coordinates.

Afixed to the lens center is the camera frame of reference with  unit
vectors  i, j and k. When the image formed by the camera is placed in
correspondence to a display screen, as illustrated in figure 1.3, the
unit  vector  i  maps  into  the  rightward positive x of the display
screen, and the unit vector j maps into the upward positive y of  the
display screen, and the unit vector k comes out of the display screen
to form a right handed coordinate system.  Together  the  three  unit
vectors of the camera are the three by three rotation matrix:

		IX, IY, IZ
		JX, JY, JZ	in world coordinates.
		KX, KY, KZ

Next,  there  are  three  scales  which  determine the correspondence
between world size and image size. Observe that the world is measured
in physical units of length like meters or feet while computer images
come in integral sizes like 1024 by 1024 or 480 rows by 512  columns,
thus  the  conversion  scales must be in terms of logical image units
per physical world units.   In an actual television camera  a  minute
image  (say  9mm  by 12mm) is formed on a vidicon tube and that image
has a particular number of rows and columns. It is the  little  image
on the vidicon that we pretend to model by the six parameters:

		LDX, LDY, LDZ		Logical raster size.
		PDX, PDY, FOCAL		Physical raster size.

Where the number named FOCAL, is the focal plane  distance  which  in
this model (with distant objects) can safely be equated with the lens
focal length and can be given in millimeters (conventional  lens  run
12.5mm  to  75mm for 1" TV).   The integer LDZ is an artifact so that
the units come out correctly in the  Z  dimension.  Thus  the  scales
factors are defined:

		SCALEX ← -FOCAL*LDX/PDX;
		SCALEY ← -FOCAL*LDY/PDY;
		SCALEZ ←  FOCAL*LDZ;

	This  simple  camera  model  is  used to compute vertex image
coordinates.   A more elaborate physical camera model can be found in
Sobel [reference 9].
The Vertex.

	A  vertex  is  a labeled point in a Euclidean three
space.   The important thing about a vertex is its world locus  (with
component names XWC,YWC,ZWC standing for world-coordinates).   Vertex
locii are the basic geometric data  from which length, area,  volume,
face  vectors  and image positions can be computed. Also a Vertex may
exist simultaneously in one or more  image  spaces.  An  image  space
(with component names XPP,YPP,ZPP standing for perspective-projected)
is always three dimensional and is determined with respect to a given
camera  centered  coordinate system (with component names XCC,YCC,ZCC
standing for camera-coordinates).    The third image component,  ZPP,
is  taken  inversely  proportional to the distance of the vertex from
the camera image  plane,  ZCC.  Using  the  camera  of  the  previous
section.  The  transformation  of  a  vertex  world locus to a camera
centered locus is:

			X ← XWC - CX;
			Y ← YWC - CY;
			Z ← ZWC - CZ;

			XCC ← IX*X + IY*Y + IZ*Z;
			YCC ← JX*X + JY*Y + JZ*Z;
			ZCC ← KX*X + KY*Y + KZ*Z;

	The first three assignment statements are the translation  to
the  camera  frame's origin,  the  second  three  assignments are the
rotation to the camera frame's orientation.    Next  the  perspective
projection transformation is computed:

			XPP ←  SCALEX*XCC/ZCC;
			YPP ←  SCALEY*YCC/ZCC;
			ZPP ←  SCALEZ    /ZCC;

	The XPP and YPP assignments are derived by means  of  similar
triangles,  as  is  being  done  by  the  man  in figure 1.5; the Zpp
assignment  is  for  preserving  the  depth   information   and   the
colinearity of the world in the perspective projected image space. As
given, the PP frame is right handed and  vertices  in  front  of  the
camera's  image  plane will have negative Zpp; Zpp values near -FOCAL
are close to the camera and values approaching zero are far away.

	A final matter with respect to vertices is their valence. The
valence of a vertex is the number of edges that meet at the vertex. A
vertex valence of three, for example, indicates a trihedral corner.
The Edge.

	For  a  start, the structure of an edge need be thought of as
little more than two vertices; the topological subtlety of edges will
be  explained  later.  However,  two vertices do define the important
geometric edge data called the 2D line coefficients.  Named  AA,  BB,
CC; these coefficients are computed from the perspective locus of the
edge's endpoints as follows:

                        AA ← Y1 - Y2;
                        BB ← X2 - X1;
                        CC ← X1*Y2 - X2*Y1;

These coefficients appear  in  the  2D  equation  of  the  line  that
contains the edge:
			0 = AA*X + BB*Y + CC;

When the edge coefficients are normalized:

			L   ← SQRT(AA↑2+BB↑2);
			AA  ← AA/L;
			BB  ← BB/L;
			CC  ← CC/L;

the line equation gives the distance, of a point X,Y from the line:

			Q ← AA*X + BB*Y + CC;

The distance is actually ABS(Q), since Q is negative on one side side
of  the  line;  also if one were standing on the plane at point X1,Y1
facing X2,Y2 the Q positive half-plane would be on your left and  the
Q negative half plane would be on your right.

	An  important  operation on two edges is to detect whether or
not they intersect; this can be decided by checking first whether the
endpoints  of  one  edge are in the opposite half planes of the other
edge, and second whether the endpoints of the latter edge are in  the
opposite half planes of the first.  When both conditions obtain, then
the intersection point can be found:

			T ← (A1*B2 - A2*B1);
			X ← (B1*C2 - B2*C1)/T;
			Y ← (A2*C1 - A1*C2)/T;

An  actual  compare  for  intersection should initially check for the
identity case, and for edges with a vertex in common.
The Face.

	A face is a finite  region  of  plane  enclosed  by  straight
lines.   A  safe  formal  face structure could be built by defining a
triangle as three non-colinear vertices and then insisting  that  all
faces  be  triangle  interiors.   Unhappily,  BFEV  faces are usually
represented as a list of vertices and edges (or by  something  nearly
equivalent)  for  the sake of saving memory space.  Such `list' faces
are not monolithic but tend to suffer special cases  and  pathologies
such as:
		Coincident or crossing edges,
		Holes and Disjointness,
		Concavity (& Convexity),
		Non-coplanarity.

Like edges, faces have characteristic coefficients. Face coefficients
satisfy the equation of a plane in which the face is embedded:

	AA*X + BB*Y + CC*ZZ = KK.

The equation could be divided by KK, but that is undesirable  because
the AA, BB, CC are more useful as a unit normal vector, in which case
KK is the distance of the origin from the plane.  Given the locii  of
three  non-colinear  vertices, the  coefficients  of  a  plane can be
computed by Kramer's rule as follows:

	KK    ←   X1*(Z2*Y3-Y2*Z3) 
		+ Y1*(X2*Z3-Z2*X3) 
		+ Z1*(Y2*X3-X2*Y3);
	AA    ← (Z1*(Y2-Y3) + Z2*(Y3-Y1) + Z3*(Y1-Y2));
	BB    ← (X1*(Z2-Z3) + X2*(Z3-Z1) + X3*(Z1-Z2));
	CC    ← (X1*(Y3-Y2) + X2*(Y1-Y3) + X3*(Y2-Y1));

and normalized:

	ABC ← SQRT(AA↑2 + BB↑2 + CC↑2);
	AA  ← AA/ABC;
	BB  ← BB/ABC;
	CC  ← CC/ABC;
	KK  ← KK/ABC;
	
If  the  given  vertices  V1,  V2,  V3  had  been taken going counter
clockwise about the face as viewed from the exterior  of  the  solid,
then the following relations obtain:

	AA*X + BB*Y + CC*Z > KK implies X,Y,Z above the plane.
	AA*X + BB*Y + CC*Z = KK implies X,Y,Z in the plane.
	AA*X + BB*Y + CC*Z < KK implies X,Y,Z below the plane.

Face coefficients prove useful in both world and image coordinates.
Polyhedra, Bodies and Objects.

	In  elementary  geometry,  a polyhedron is said to be a solid
formed (or bounded) by plane faces; the word  "polyhedron"  literally
meaning  "many-faced".      Topologically,  simple  polyhedra satisfy
Euler's F-E+V=2 equation; where F, E and V are the number  of  faces,
edges  and vertices of the polyhedron respectively. This equation was
known to Descartes in 1640, but the first proof  wasn't  given  until
1752,  when  Euler  proved  the  relation  by  considering  the graph
corresponding to the edges of polyhedra.  A simple polyhedron is  one
homeomorphic to a sphere. The rigorous development of volume measure,
and in turn `solid' polyhedra, is not simple; thus it has been easier
to  take  the  topological  notion  F-E+V=2  as  the  more  primitive
definition of a polyhedron on which to base a data structure  and  to
proceed  towards  the  appearance  of  `solidness'  which  is  a more
complicated notion.

	Counter to the usual usuage, I define the word "body" to mean
an  entity  more  specific  than  a polyhedron; the idea being that a
polyhedron is represented by the whole structure  of  bodies,  faces,
edges  and vertices. Bodies may have location, orientation and volume
in space. Bodies may be conected to faces, edges and vertices,  which
may  or  may  not  form a complete polyhedron.  It is typical to have
only one body to a polyhedron when representing a rigid object like a
sledge  hammer and several bodies to a polyhedron when representing a
flexible object like a man. Furthermore, the body concept is used  to
handle  the  notion  of parts and abstract regional objects such as a
parking  lot.      For  example,  the  Stanford  AI  Parking  Lot  is
represented  by  a  body that has three parts:  the Near, Mid and Far
Lots. The Near Lot then has aisles and lanes and lamp islands; a lamp
island  has  a  curb  and two lamps; a lamp has a base, stem and top.
This parts structure is carried in body nodes.    Finally,  the  word
"object"  will  be  used  to  refer  to  physical  objects  such as a
redwood-tree, building, or roadway.
Three kinds of perimeter.

Figure 1.6
FACE PERIMETER - a face is surrounded by edges and vertices.

                     VERTEX  
		        ⊗
		       / \
                      /   \
                     /     \
                    /       \
             EDGE  /         \  EDGE
                  /           \
                 /    FACE     \
                /               \
               /                 \
              /                   \
      VERTEX ⊗---------------------⊗ VERTEX
		      EDGE

Figure 1.7
VERTEX PERIMETER - a vertex is surrounded by edges and faces.

		      EDGE
			|
			|
	      FACE	|      FACE
			|
			⊗ VERTEX
	               / \
	              /   \
	             /     \
	            /       \
		   /  FACE   \
	       EDGE    	      EDGE

Figure 1.8
EDGE PERIMETER - an edge is surrounded by 2 faces & 2 vertices.

		     VERTEX

			⊗
			|
			|
		FACE	E    FACE
			|
			|
			⊗
		
		     VERTEX
BFEV Accessing.

	1. Accessing by name and serial number.
	2. Parts-Tree Accessing.
	3. FEV Sequential Accessing.
	4. FEV Perimeter Accessing.

	A   BFEV  model  has  four  kinds  of  accessing.   The  most
conventional BFEV access is retrieval by symbolic name which requires
a  symbol  table. Next, between bodies there is Parts-Tree accessing.
At the top of the Parts-Tree is a special body  named  the  world  to
which  all  the other bodies are attached; thus the world body serves
as an OBLIST node. Given a particular body, a list of  its  sub-parts
can  be  retrieved as well as its supra-part or "supart". A supart is
the whole entity to which a part belongs, the  world  being  its  own
supart.

	Within  each  body  there is face, edge and vertex sequential
accessing. Given a body, all its faces, or edges, or vertices need to
be  readily available since perspective projection loops thru all the
vertices, and the process of display  clipping  loops  thru  all  the
edges,  and  the act of checking for body intersection loops thru all
the faces. In LISP, one might provide  FEV  sequential  accessing  by
placing  a  list  of faces, a list of edges and a list of vertices on
the property list of each body, so that a cube might  be  represented
as:

	(DEFPROP CUBE (F1 F2 F3 F4 F5 F6) FACES)
	(DEFPROP CUBE (E1 E2 E3 E4 E5 E6 E7 E8 E9 E10 E11 E12)EDGES)
	(DEFPROP CUBE (V1 V2 V3 V4 V5 V6 V7 V8) VERTICES)

	Finally,  among the faces, edges and vertices of a body there
is perimeter accessing. Faces have a perimeter of edges and  vertices
[figure  1.6]; less commonly used, vertices have a perimeter of edges
and faces  [figure  1.7];  and  of  particular  note,  edges  have  a
perimeter  always formed by two faces and two vertices, [figure 1.8].
Perimeter accessing requires that given a face, edge or vertex,  that
the perimeter of that entity be readily accessible. Since the surface
of a polyhedron is orientable, that is has a well defined inside  and
outside,  (Klein  bottles  with their crosscaps will not be modeled),
such perimeter lists can be ordered (say clockwise) with  respect  to
the  exterior  of the polyhedron. Perimeter accessing is mentioned in
Guzman [reference 6] and Falk [reference 4]  and  is  the  underlying
basis  of  part-II  of  this  paper which presents a polyhedron model
built for accessing and altering face, edge and vertex perimeters.